% Unify.tex
% vim:set ft=tex spell:

\documentclass{article}

\usepackage {xcolor}
\newcommand \todo[1] {\textcolor{red}{TODO: }#1}

\newcommand \catname[1] {{\normalfont\textbf{#1}}}
\newcommand \Set {\catname {Set}}
\newcommand \Cat {\catname {Cat}}

\begin{document}

\section {Intro}
Normally unification is presented in catergory-theoretic terms as a coequalizer:

we have an indexed set of equations $s_i = t_i$ where the $\{s_i, t_i\}$ are terms.
Then we ask the algorithm for a most generic unifier $q$.

\todo {describe how the unification works.}

\section {Observation 1}
The first observation is that why not only deal with substitutions only? I.e. invent a finite set of variable names $vl_i$ and $vr_i$ and let those point to $s_i$ resp. $t_i$. Then we have a bona fide substitution $V \to T_{\Omega}W$, where $W$ is the variable set of our terms $\{s_i, t_i\}$. This appears to be one of the ingredients of the coequalizer approach. We pretend to not unify two (sets of) terms, but two corresponding Kleisli-arrows $vl_i \mapsto s_i$ and $vr_i \mapsto t_i$ (which are substitutions).

\section {Observation 2}
The second observation is that having substitutions with just variables as the substituend kills the symmetry in the unification algorithm. (HOW SO?) So we'll forbid $v \to w$ substitutions altogether. Now when we encounter a unification necessity $v \equiv w$ what should we do?
My suggestion is to create an equivalence class of variable names that contains both $v$ and $w$. We'll revisit this issue later but for now it is enough to hint at path-connectedness (i.e. \emph{groupoids}).

Then the domain of our Kleisli-arrows must be equivalence classes of variable names. Since a single variable name in $T_{\Omega}W$ uniquely determines the equivalence class of variable names, why not also pass to quotients in $T_{\Omega}W$?
We'll do so now. What we get is monads over equivalence classes (quotiened sets). This also means that the number of bindings in unifiers will potentially decrease, because
\begin{itemize}
\item[a)] we do not have variable-to-variable substitutions any more,
\item[b)] we'll have only one instead of $\{a \to x, b \to y\}$ bindings when $a$ and $b$ happen to land in the same equivalence class.
\end{itemize}
But this brings us to the principal problem with this approach. When we augment an equivalence class for whatever reason, we have to hunt down all bindings that are affected and coalesce them. Which in turn may expose more obligations to merge classes until a fixpoint is reached. When we had a finite number of variable names to start with, this process will terminate.
\par Our algorithm must keep a worklist of outstanding unification necessities (i.e. unmerged terms) and the current up-to-date substitution with consolidated equivalence classes.
When the former list becomes empty, we have the desired mgu in our hands.
If we can establish a correct-by construction data type for substitutions that precludes the possibility of overlapping equivalence classes in the key- as well as the term set of the substitution, then we get an almost machine-checked algorithm.

\section {Groupoids}
Let's come back to the idea of equivalence classes. We noted that they merge when we encounter two variable names (or more precisely their classes) that must be unified. But how can we make the notion of the classes precise? I propose to build paths between variable names, thus obtain path-connected components in the set of variable names. The elements of the loop space then become the connected components and classes. \todo {check that this is indeed a groupoid.} $T_{\Omega}$ is then a functor between groupoids. Of course passage to quotients reverts $T_{\Omega}$ to a \Set monad again.
\par When applying the slogan of HoTT, that \emph {types are groupoids}, then we might wonder whether a substitution is a mapping from types to terms with variable sets that are (equivalence classes of) inhabitants of a type. Thus unification, by creating bridges between equivalence classes, could be seen as a way of defining \emph {higher inductive types} (HITs). \todo {does this in some way add computational content to HITs?}
\subsection {Conversion of variable names to inhabitants}
Suppose we can track the individual identifications between variable names in the course of the unification algorithm. (Remember that these are symmetric in nature, because we avoid variable-to-variable substitutions.) Then, upon successful termination of the algorithm we'll end up with the \emph {continents} (a.k.a. path-connected components) and its \emph {roards} between its \emph {cities} (a.k.a. paths between objects). Since the obtained unifications can represent any identity compatible with the groupoid structure, we can insist that they are actually the groupoid morphisms. Then we can instantiate each variable, (non-composite) morphism, and possibly higher homotopies, from a set of inhabitants and consider it as the synthetic blueprint of a new type. It will satisfy the defining equation by definition and the inhabitants can be pattern matched (path induction!) against.
\par While this may sound as a tedious way of defining a new type, there might be nice applications to it, who knows.

\section {Bibliography}
\begin{itemize}
\item Goguen 89
\item Burstall-Rydeheard
\item HoTT Book
\end{itemize}
\end{document}
